8 research outputs found
Recommended from our members
Word shape analysis for a hybrid recognition system
This paper describes two wholistic recognizers developed for use in a hybrid recognition system. The recognizers use information about the word shape. This information is strongly related to word zoning. One of the recognizers is explicitly limited by the accuracy of the zoning information extraction. The other recognizer is designed so as to avoid this limitation. The recognizers use very simple sets of features and fuzzy set based pattern matching techniques. This aims to increase their robustness, but it also causes problems with disambiguation of the results. A verification mechanism, using letter alternatives as compound features, is introduced. Letter alternatives are obtained from a segmentation based recognizer coexisting in the hybrid system. Despite some remaining disambiguation problems, wholistic recognizers are found capable of outperforming the segmentation based recognizer. When the recognizers work together in a hybrid system, the results are significantly higher than those of the individual recognizers. Recognition results are reported and compared.
Feature extraction: on the importance of zoning information in cursive script recognition
Dynamic cursive script recognition, a hybrid approach
this paper attempts to recognize entire words, but should it fail, it attempts to complete the word by consulting a lexicon. Words with identical beginnings are usually morphologically related. The system selects a similar word which fits the apparent size of the input. Even if the wrong form of the word is chosen, the selection of a related word is preferable to no result or a clearly incorrect one. Both static and dynamic approaches to the recognition could benefit from such word ending postulation.
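The word ending postulation described in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `complete_word`, the use of a letter count as a stand-in for the "apparent size of the input", and the alphabetical tie-break are all assumptions made for illustration.

```python
from typing import Iterable, Optional


def complete_word(prefix: str, estimated_length: int,
                  lexicon: Iterable[str]) -> Optional[str]:
    """Pick a lexicon word that starts with the recognized prefix and whose
    length is closest to the estimated length of the input word.

    Hypothetical helper: the abstract only says that a similar word fitting
    the apparent size of the input is selected; measuring size in letters
    is an assumption.
    """
    candidates = [word for word in lexicon if word.startswith(prefix)]
    if not candidates:
        return None  # no word in the lexicon shares the recognized beginning
    # Closest length first; ties are broken alphabetically for determinism.
    return min(candidates,
               key=lambda word: (abs(len(word) - estimated_length), word))


lexicon = ["recognize", "recognition", "recognizer", "record"]
print(complete_word("recogn", 11, lexicon))
```

Because words with identical beginnings tend to be morphologically related, even a wrong pick from the candidate list is usually a related form of the intended word.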
Handwritten word recognition using Web resources and recurrent neural networks
Handwriting recognition systems usually rely on static dictionaries and language models. Full coverage of these dictionaries is generally not achieved when dealing with unrestricted document corpora due to the presence of Out-Of-Vocabulary (OOV) words. We propose an approach which uses the World Wide Web as a corpus to improve dictionary coverage. We exploit the very large and freely available Wikipedia corpus in order to obtain dynamic dictionaries on the fly. We rely on recurrent neural network (RNN) recognizers, with and without linguistic resources, to detect words that are non-reliably recognized within a word sequence. Such words are labeled as non-anchor words (NAWs) and include OOVs and In-Vocabulary words recognized with low confidence. To recognize a non-anchor word, a dynamic dictionary is built by selecting words from the Web resource based on their string similarity with the NAW image, and their linguistic relevance in the NAW context. Similarity is evaluated by computing the edit distance between the sequence of characters generated by the RNN recognizer exploited as a filler model, and the Wikipedia words. Linguistic relevance is based on an N-gram language model estimated from the Wikipedia corpus. Experiments conducted on a word-segmented version of the publicly available RIMES database show that the proposed approach can improve recognition accuracy compared to systems based on static dictionaries only. The proposed approach shows even better behavior as the proportion of OOVs increases, in terms of both accuracy and dictionary coverage.
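The string-similarity step of this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' system: the names `edit_distance` and `build_dynamic_dictionary`, the `size` parameter, and the small word list are hypothetical, and the N-gram relevance term is left out entirely.

```python
from typing import Iterable, List


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling row of the DP table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def build_dynamic_dictionary(hypothesis: str, corpus_words: Iterable[str],
                             size: int = 3) -> List[str]:
    """Keep the `size` corpus words closest to the character sequence
    produced by the recognizer (assumed here to be a plain string)."""
    ranked = sorted(set(corpus_words),
                    key=lambda word: (edit_distance(hypothesis, word), word))
    return ranked[:size]


# Hypothetical stand-in for words drawn from the Web resource.
corpus_words = ["recognition", "recognise", "cognition", "region"]
print(build_dynamic_dictionary("recogniton", corpus_words, size=1))
```

In the approach described above, candidates are also scored for linguistic relevance in context with an N-gram language model; combining both scores is outside the scope of this sketch.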